Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Static Type Annotations for SymPy #4

Open
wants to merge 17 commits into
base: main
Choose a base branch
from
Open

Conversation

sylee957
Copy link
Member

No description provided.

@certik
Copy link
Member

certik commented Aug 25, 2023

Timely. I think this can be adopted (and enforced with mypy) gradually.

I would like to get LPython to the point that SymPy could use it, but we are not there yet. Here is what works today: https://github.com/lcompilers/lpython/blob/615e7e7d6baa032ab29b8b5f3999dba0e7e6640b/integration_tests/symbolics_05.py, this gets compiled to almost maximum performance code, using SymEngine as the backend.

For now we settled on using just "S" as the type, since it's short, and it will be used a lot, for every variable, etc.

If you have ideas or want to collaborate on this, let me know.

@moorepants
Copy link
Member

Is this the entirety of the proposal or it is a draft? I ask because it does not seem to address any of the issues that have been brought up about adding typing to SymPy nor does it outline any technical considerations about how this would be done or any policies on exactly what typing will and will not be in SymPy. The proposal even seems to imply removing run time type checks, which will also need much more explanation. There are no references to the many past discussions on the topic. My feedback at this point, is that the proposal needs much more content and specifics.

@oscargus
Copy link

Just want to chip in that the use of .pyi-files should be considered as an alternative to bloating editing the existing files.

@oscarbenjamin
Copy link
Contributor

Just want to chip in that the use of .pyi-files should be considered

They should be considered. Perhaps the SymPEP should spell out what that would look like and clarify whether there would be any "policy" about it.

I prefer the type annotations to be inline when the types are good but some SymPy functions do not exactly have types that can be expressed succinctly.

I think it is worth noting that there are two different things that type hints can be used for:

  1. Static type checking within the codebase and downstream codebases
  2. Auto completion etc in editors

My impression is that actually autocompletion is the most frequent use of type hints. Basically if SymPy exposes type hints somehow (inline or pyi files) then editors like vscode etc will use that to help users write code that uses SymPy. I estimate that probably the majority of SymPy users probably do use editors that would do this.

See also
sympy/sympy#25103

SymPEP-XXXX.md Outdated Show resolved Hide resolved
SymPEP-XXXX.md Outdated Show resolved Hide resolved
SymPEP-XXXX.md Outdated

It's important to note that tools like mypy and pyright are capable of inferring types, facilitating the incorporation of static typing without a steep learning curve. By adding a few annotations, code can be made cleaner and the transition to static typing across the entire codebase can be eased.

For functions, classes, and modules internally used by SymPy that currently feature unnecessary dynamic type checks, a shift towards static typing should be promoted. This transition can help eliminate these unnecessary checks and subsequently enhance the overall performance of the SymPy codebase.
Copy link
Member

@moorepants moorepants Aug 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who deems these checks unnecessary? If we want a function/method to be duck-typed, I believe we should be allowed to do so. Duck-typing has been the design paradigm preached by the Python community since I can remember and it is a feature, not a bug.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should not drop dynamic type checks because of adding static typing. When not using an editor supporting this (you have no idea how many students edit their code in whatever the system offers), there should still be sensible error messages, not the code crashing later with a seemingly unrelated error message.

If nothing else, this will probably reduce the number of issues opened that boils down to the user doing something wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The checks are necessary in user-facing functions but should not be necessary in much of the internal code if a static checker can verify correctness.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note also that if the type hints are used consistently on user facing functions then an editor like vscode will already warn the user about passing the wrong type into a function before the code even runs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python allows to inspect type annotations in runtime

def square(x: int):
    return x * x

square.__annotations__
# {'x': int}

such that we can implement something like @validate such that

# Function for users that raises on isinstance(x, int)
@validate
def square(x: int):
    return x * x

# Function for internal use
def square(x: int):
    return x * x

So even if we want to do runtime validation,
It should be better done like above, than repeating isinstance checks manually,
which often makes the code more complicated with validation code.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree. Just wanted to point out that not all people use editors with type hinting (and/or do not understand how to benefit from them).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not all people use editors with type hinting

Agreed. I make very heavy use of ipython for example which does not support this.

Note though that the sentence in the PEP that we are all commenting on here explicitly says (emphasis added):

For functions, classes, and modules INTERNALLY used by SymPy that currently feature unnecessary dynamic type checks, a shift towards static typing should be promoted.

The example here is a good one:
https://github.com/sympy/sympy/blob/b0dcb5af49e7289680d9789d292197675e40490d/sympy/polys/polyclasses.py#L167-L171
In gh-25651 I found bugs by enabling that check. The check is too expensive to be used at runtime though so it was commented out before merge. Almost everything that it checks is something that could be verified by a static type checker but doing that check at runtime means recursively calling isinstance down through a potentially large data structure. The class in question is purely internal and not something that any ordinary SymPy user would ever see directly.

Note also that just attempting to verify that the codebase does indeed only use those types in this particular function is something that consumed development time. The whole PR gh-25651 would be completely unnecessary if the code in question just used type hints but as it is if I want to verify what the types are than I have to add runtime checks and ensure that the entire CI test suite completes successfully (and then fix the bugs that show up).

SymPEP-XXXX.md Outdated

For functions, classes, and modules internally used by SymPy that currently feature unnecessary dynamic type checks, a shift towards static typing should be promoted. This transition can help eliminate these unnecessary checks and subsequently enhance the overall performance of the SymPy codebase.

Similarly, functions, classes, and modules within SymPy that intentionally avoided runtime type checks for performance reasons should consider embracing static typing. Static typing undoubtedly mitigates performance overhead while providing better bug detection related to type errors.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a potentially compelling statement. If there were some real world metrics that show how much a large codebase like SymPy could be sped up by removing all runtime checks and moving to a linting-based type checking procedure, then that would help convince us of any need to do such a radical change.

Copy link
Member

@moorepants moorepants Aug 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand though, the speed bottlenecks in SymPy are not due to large numbers of runtime checks but due to inefficient symbolic algorithms.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can be some discussions about how checking list[list[int]] in runtime could be avoided in polynomial systems, and can be replaced by static typing.

https://github.com/sympy/sympy/blob/b0dcb5af49e7289680d9789d292197675e40490d/sympy/polys/polyclasses.py#L167-L171

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @sylee957 notes it is not just about speed bottlenecks because in many cases the checking is not done to avoid the slowdowns that it would cause. Then the cost of not checking is bugs, debugging etc rather then any immediate measurable slowness. Static type checking can reduce certain types of bugs without causing runtime slowdown.

@sylee957
Copy link
Member Author

sylee957 commented Aug 26, 2023

@moorepants
I encourage to take a look at another review though.
Even though forcing the community to develop in static typing may not be be opt in,
providing users with type stubs should be a responsibility to do, without exception
regardless of whether the Python itself is duck typing language, almost of all its stdlib has type annotations.

@moorepants
Copy link
Member

When you come to a first complete draft, I'll review again. Thanks for championing this PEP.

@certik
Copy link
Member

certik commented Aug 26, 2023

@sylee957 do you think you could please put in the proposal the discussion of using "S" instead of "Expr" for the type of a symbolic expression?

Also, to mention the possibility of using Python compilers that these annotations might allow. However, we need to work together on settling on a type system style, this is some research work to be done, and I am not asking that we do it now. All I am asking is that the door is not closed today to this effort in the future.

See my comment above.

@sylee957
Copy link
Member Author

https://github.com/sympy/SymPEPs/blob/9bf56ed1d3bee09cfa807f6974d24e3ff6ae32c3/SymPEP-XXXX.md#using-s-for-sympy-objects

Okay, I try to add that part.
But let me know if I misunderstood something

@certik
Copy link
Member

certik commented Aug 26, 2023

@sylee957 very good, thanks! I think that summarizes it well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants